Enumeration of Rooted Binary Unlabeled Galled Trees

Rooted binary galled trees generalize rooted binary trees to allow a restricted class of cycles, known as galls. We build upon the Wedderburn-Etherington enumeration of rooted binary unlabeled trees with n leaves to enumerate rooted binary unlabeled galled trees with n leaves, also enumerating rooted binary unlabeled galled trees with n leaves and g galls, $0 \leqslant g \leqslant \lfloor \frac{n-1}{2} \rfloor$. The enumerations rely on a recursive decomposition that considers subtrees descended from the nodes of a gall, adopting a restriction on galls that amounts to considering only the rooted binary normal unlabeled galled trees in our enumeration. We write an implicit expression for the generating function encoding the numbers of trees for all n. We show that the number of rooted binary unlabeled galled trees grows with $0.0779(4.8230^n)n^{-\frac{3}{2}}$, exceeding the growth $0.3188(2.4833^n)n^{-\frac{3}{2}}$ of the number of rooted binary unlabeled trees without galls.
However, the growth of the number of galled trees with only one gall has the same exponential order 2.4833 as the number with no galls, exceeding it only in the subexponential term, $0.3910n^{\frac{1}{2}}$ compared to $0.3188n^{-\frac{3}{2}}$. For a fixed number of leaves n, the number of galls g that produces the largest number of rooted binary unlabeled galled trees lies intermediate between the minimum of $g=0$ and the maximum of $g=\lfloor \frac{n-1}{2} \rfloor$. We discuss implications in mathematical phylogenetics.


Introduction
Evolutionary histories of genes, populations, and species are often described by phylogenetic trees that seek to represent their descent relationships. Owing in part to the centrality of phylogenetic trees in evolutionary biology, mathematical studies have characterized numerous classes of phylogenetic trees, investigating their combinatorial properties (Semple and Steel 2003; Felsenstein 2004; Gascuel 2005; Steel 2016; Warnow 2018).
The use of tree structures, typically treated as binary, is often appropriate for representing standard phenomena of evolutionary descent, by which biological entities sequentially bifurcate, in a manner in which diverged entities do not merge back together. Processes such as genetic admixture, horizontal gene transfer, and hybridization, however, produce evolutionary relationships that are not tree-like. These processes involve the merging of separate lineages that had previously descended from shared ancestors. With increasing interest in merging mechanisms during evolutionary descent, much recent attention in mathematical phylogenetics has been devoted to phylogenetic networks (Huson et al. 2010; Gusfield 2014; Kong et al. 2022), in which graphs describing relationships of biological entities permit certain types of cycles.
Among the simplest phylogenetic networks are the galled trees, named for the growths that can appear in plant tissues to produce distinctively shaped structures (Gusfield et al. 2003, 2004b). First introduced in studies of ancestral recombination graphs (Wang et al. 2001; Gusfield et al. 2003, 2004a, b; Gusfield 2005, 2014; Song 2006), galled trees allow diverged lineages to merge forward in time, but only in circumscribed ways. Each merging event creates a gall corresponding to a cycle in the associated network.
From a standpoint that considers galled trees as mathematical objects separately from the processes that could produce them biologically, the defining feature of a galled tree is that cycles in a graph structure are disjoint, so that in a galled tree, a vertex or edge is contained in at most one cycle (Semple and Steel 2006). With this graph-theoretic sense for the meaning of galled trees, the enumerative combinatorics of galled trees has been investigated, both for unrooted and for rooted binary galled trees, focusing on galled trees that are leaf-labeled (Semple and Steel 2006; Bouvel et al. 2020; Cardona and Zhang 2020). Chang et al. (2018) and Mathur and Rosenberg (2023) have posed the problem of enumerating rooted binary galled trees in which the leaves are not labeled. In a study focused on introducing encodings for galled trees, Chang et al. (2018) argued that the number of rooted binary unlabeled galled trees with n leaves is bounded above by a sequence with a certain generating function. In an enumerative study of labeled histories for rooted binary leaf-labeled galled trees, Mathur and Rosenberg (2023) enumerated a class of rooted binary unlabeled galled trees for n from 1 to 6, obtaining 1, 1, 2, 6, 20, 72. These values are indeed bounded above by the corresponding upper bounds of Chang et al. (2018), which are 1, 1, 4, 28, 245, 2402 for n = 1 to 6, though Chang et al. (2018) used a more expansive definition of rooted binary unlabeled galled trees. They are also bounded above by the enumeration in Theorem 8 of Cardona and Zhang (2020) of the corresponding set of rooted binary labeled galled trees, which gives 1, 1, 6, 69, 960, 24,750 for n = 1 to 6.
How many rooted binary unlabeled galled trees possess a given number of leaves n and a given number of galls g? Here, we perform this general enumeration, counting the rooted binary unlabeled galled trees of Mathur and Rosenberg (2023) for n ≥ 1 leaves and g ≥ 0 galls. We first recursively enumerate all such rooted binary unlabeled galled trees with a specified number of leaves n, considering all possible numbers of galls. We then refine this enumeration by subdividing it according to specified numbers of leaves n and galls g, considering all possible values of g for a fixed n.

Fig. 1 Rooted galled trees. (A) A rooted galled tree. In our definition of rooted galled trees, this example is the smallest network that possesses a gall. The gall is a root gall; node 1 is r, the top node; node 2 is the left hybridizing side node; node 4 is the right hybridizing side node; finally, node 3 is $a_r$, the hybrid node. We depict the hybridizing side nodes and the hybrid node in a horizontal line, representing simultaneity of these nodes in the embedding of the rooted galled tree in time, proceeding from the top to the bottom of the diagram. (B) A network that does not satisfy our definition of a rooted galled tree, but that does qualify according to some definitions. This network is missing a hybrid node; each gall in our definition possesses at least four nodes. (C) A more complex rooted galled tree by our definition. (D) A more complex network that is not a rooted galled tree by our definition, because the red triangle lacks a hybrid node.

Definitions
We follow Mathur and Rosenberg (2023) in describing key concepts, assuming that all networks and trees are binary (and henceforth dropping the term binary). A rooted phylogenetic network is a directed acyclic graph with four properties: (i) there exists a unique root node with in-degree 0 and out-degree 2; (ii) all leaf nodes have in-degree 1 and out-degree 0; (iii) all non-leaf, non-root nodes have in-degree 2 and out-degree 1 or in-degree 1 and out-degree 2; and (iv) all edges are directed away from the root. Nodes with in-degree 2 and out-degree 1 are termed reticulation nodes, and nodes with in-degree 1 and out-degree 2 are tree nodes.
A rooted galled tree is a rooted (binary) phylogenetic network in which two properties hold (Fig. 1). First, (i) each reticulation node $a_r$ has a unique ancestor node r such that exactly two non-overlapping paths of edges exist from r to $a_r$; if the direction of edges is ignored, then the two paths connecting r and $a_r$ form a cycle $C_r$, known as a gall. Following Mathur and Rosenberg (2023), the ancestor node r must be separated from $a_r$ by at least two edges. This requirement that cycles contain at least four nodes follows from the perspective of Mathur and Rosenberg (2023) that views galled trees as evolving temporally by a biological process such as hybridization. It is equivalent to the requirement that a galled tree be a normal network and is not imposed in a more expansive galled tree definition that permits 3-node galls (Kong et al. 2022).
The second criterion is: (ii) the set of nodes in the gall $C_r$, associated with reticulation node $a_r$, and the set of nodes in the gall $C_s$, associated with reticulation node $a_s \neq a_r$, are disjoint.
We term the ancestor node r of a gall with reticulation node $a_r$ and cycle $C_r$ the top node. Other nodes in a gall, excluding the top node and reticulation node, are called side nodes. We term the reticulation node a hybrid node, and the two immediate parents of a hybrid node hybridizing side nodes, or just hybridizing nodes. The side nodes to the left and right of the hybrid node are the left side nodes and right side nodes, respectively; the distinction between "left" and "right" is only for convenience, and a gall is invariant with respect to exchange of its left and right side nodes. If the root node of a rooted galled tree is part of a gall, then we call this gall the root gall. The root node is always a top node if it is part of a gall.
Although a rooted galled tree is only strictly a tree if it contains no galls, it is convenient to continue to refer to galled trees as trees; similarly, we allow "subtrees" to possess galls. All networks and trees that we consider are rooted, and we henceforth drop the term rooted. Mathur and Rosenberg (2023) focused on labeled galled trees, in which each leaf is associated with a distinct leaf label; here we consider unlabeled galled trees, and we often drop the term unlabeled. Unlike Mathur and Rosenberg (2023), we have no need to assign a temporal embedding to nodes, with the exception that ancestor nodes can be no more recent than their descendants; the galled trees that we consider are understood to be unordered.

Compositions
We will have occasion to consider the compositions of a positive integer a into b parts: the ordered b-tuples of positive integers whose sum equals a. We denote the set of these compositions by $C(a, b)$. For $1 \leq b \leq a$, the number of compositions is
$$|C(a, b)| = \binom{a-1}{b-1}.$$
This result is obtained by noting that a list of a copies of the number 1 has a − 1 "breakpoints" between consecutive 1's, and, summing 1's between neighboring breakpoints, the compositions into b parts are produced by the distinct sets of b − 1 among the a − 1 breakpoint locations. For b > a, we define $C(a, b) = \emptyset$, so that $|C(a, b)| = 0$.
We distinguish between palindromic and non-palindromic compositions. Palindromic compositions are unchanged when the order of the parts is reversed; non-palindromic compositions do change when the order is reversed. For example, in $C(9, 5)$, (1, 2, 3, 2, 1) is a palindromic composition; (1, 2, 2, 3, 1) is non-palindromic. We denote the palindromic compositions of a into b parts by $C_p(a, b)$ and the non-palindromic compositions of a into b parts by $C_{np}(a, b)$. The numbers of palindromic and non-palindromic compositions, $|C_p(a, b)|$ and $|C_{np}(a, b)|$, appear in Table 1. Palindromic compositions are counted by counting the ways to place breakpoints on the "left" half of a list of 1's; breakpoints are then palindromically placed on the "right" half. Distinct cases exist depending on the parity of a and b. For non-palindromic compositions, we obtain
$$|C_{np}(a, b)| = |C(a, b)| - |C_p(a, b)|.$$
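The breakpoint argument can be illustrated with a short enumeration. The following is a minimal sketch in Python; the function names `compositions` and `is_palindromic` are illustrative choices that do not appear in the text. It generates compositions of a into b parts by choosing b − 1 of the a − 1 breakpoints and checks the two example compositions of 9 into 5 parts.

```python
from itertools import combinations
from math import comb


def compositions(a, b):
    """Yield every composition of a into b positive parts as a tuple."""
    if b < 1 or b > a:
        return  # no compositions exist when b > a
    # Choose b - 1 of the a - 1 breakpoints between consecutive 1's.
    for cuts in combinations(range(1, a), b - 1):
        bounds = (0,) + cuts + (a,)
        yield tuple(bounds[i + 1] - bounds[i] for i in range(b))


def is_palindromic(c):
    """A composition is palindromic if reversing its parts leaves it unchanged."""
    return c == c[::-1]


all_c = list(compositions(9, 5))
pal = [c for c in all_c if is_palindromic(c)]

# The enumerated count agrees with the binomial coefficient C(a-1, b-1).
print(len(all_c), comb(8, 4))
# Palindromic and non-palindromic counts partition the total.
print(len(pal), len(all_c) - len(pal))
```

Enumerating by breakpoints mirrors the counting argument directly, at the cost of materializing every composition; it is intended only for small a.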

Unlabeled Trees with n Leaves
Our approach to enumerating unlabeled galled trees extends the Wedderburn-Etherington enumeration of unlabeled trees with no galls. For unlabeled trees with no galls, the root of a tree with n ≥ 2 leaves possesses two immediate subtrees. Assume without loss of generality that the number of leaves in the "left" subtree is greater than or equal to the number of leaves in the "right" subtree. If n ≥ 3 is odd, then $U_n$, the number of unlabeled trees with n leaves, is obtained by considering the $\frac{n-1}{2}$ possible numbers of leaves k for the right subtree, for each k pairing all $U_k$ unlabeled trees with k leaves for the right subtree with all $U_{n-k}$ unlabeled trees with n − k leaves for the left subtree. If n is even, then the enumeration is similar for $k \neq \frac{n}{2}$; if $k = \frac{n}{2}$, however, then we have $\binom{U_{n/2}}{2}$ ways of choosing two distinct subtrees for the left and right subtrees and $U_{n/2}$ ways of choosing two copies of the same subtree.
The recursion for $U_n$ is [e.g. Harding (1971), Felsenstein (2004)]:
$$U_n = \begin{cases} 1, & n = 1, \\ \sum_{k=1}^{(n-1)/2} U_k U_{n-k}, & \text{odd } n \geq 3, \\ \sum_{k=1}^{n/2-1} U_k U_{n-k} + \frac{U_{n/2}(U_{n/2}+1)}{2}, & \text{even } n. \end{cases} \tag{1}$$
With $U_0 = 0$, the generating function $U(t) = \sum_{n \geq 0} U_n t^n$ for the $U_n$ satisfies (Comtet 1974, p. 55):
$$U(t) = t + \frac{1}{2} U(t)^2 + \frac{1}{2} U(t^2). \tag{2}$$
The number of trees with no galls has exponential growth, with $U_n \sim d_0 \rho^{-n} n^{-3/2}$, where $d_0 \approx 0.3188$ and $\rho \approx 0.4027$.
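As a minimal sketch of the Wedderburn-Etherington recursion described above (the function name `wedderburn_etherington` is an illustrative choice), the following Python code pairs right subtrees of k leaves with left subtrees of n − k leaves and treats the equal split for even n separately.

```python
def wedderburn_etherington(n_max):
    """Return [U_0, U_1, ..., U_n_max], with U_0 = 0 and U_1 = 1."""
    U = [0] * (n_max + 1)
    if n_max >= 1:
        U[1] = 1
    for n in range(2, n_max + 1):
        # Right subtree has k leaves; left subtree has n - k > k leaves.
        total = sum(U[k] * U[n - k] for k in range(1, (n - 1) // 2 + 1))
        if n % 2 == 0:
            # Equal split: two distinct subtrees or two copies of one subtree.
            h = U[n // 2]
            total += h * (h + 1) // 2
        U[n] = total
    return U


print(wedderburn_etherington(10)[1:])  # [1, 1, 1, 2, 3, 6, 11, 23, 46, 98]
```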

The Maximum Number of Galls for Galled Trees with n Leaves
For a fixed number of leaves n, the number of galls that a galled tree can possess is constrained as a function of n. Because a gall contains at least three descendant subtrees (those descended from the two hybridizing nodes and the hybrid node), a minimum of n = 3 leaves is required before a tree can possess a gall. Each successive addition of a gall then replaces one subtree with a minimum of three subtrees, those descended from the two hybridizing nodes and the hybrid node of the new gall, so that each gall adds at least two leaves. It follows that a galled tree with n leaves can have at most $\lfloor \frac{n-1}{2} \rfloor$ galls (Mathur and Rosenberg 2023).

Unlabeled Galled Trees with n Leaves
We are now ready to enumerate unlabeled galled trees. We denote by $A_n$ the number of unlabeled galled trees with n leaves. Trees with n = 1 or n = 2 leaves have no galls: $A_1 = A_2 = 1$. To recursively evaluate $A_n$ for n ≥ 3 leaves, we sum counts from two cases: (1) the root is not the top node of a gall; (2) the root is the top node of a gall. We count galled trees in the former case in $B_n$, with $B_1 = B_2 = 1$, and we count galled trees in the latter case in $D_n$, with $D_1 = D_2 = 0$. The goal is to evaluate
$$A_n = B_n + D_n. \tag{3}$$

Root is not a Top Node of a Gall
If the root is not the top node of a gall, then an unlabeled galled tree possesses two immediate subtrees of the root, each of which is itself an unlabeled galled tree (Fig. 2A).
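The number of unlabeled galled trees then follows a recursion analogous to Eq. 1:
$$B_n = \begin{cases} \sum_{k=1}^{(n-1)/2} A_k A_{n-k}, & \text{odd } n \geq 3, \\ \sum_{k=1}^{n/2-1} A_k A_{n-k} + \frac{A_{n/2}(A_{n/2}+1)}{2}, & \text{even } n. \end{cases} \tag{4}$$
It is convenient to express $B_n$ in a form that considers compositions c of n into 2 parts. For odd n ≥ 3,
$$B_n = \frac{1}{2} \sum_{c \in C(n,2)} \prod_{i=1}^{2} A_{c_i}. \tag{5}$$
For even n,
$$B_n = \frac{1}{2} \sum_{c \in C(n,2)} \prod_{i=1}^{2} A_{c_i} + \frac{1}{2} A_{n/2}. \tag{6}$$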

Root is a Top Node of a Gall
If the root is a top node of a gall, then the recursion is more complex. We first count subtrees of the root gall, equal to the count of all side nodes plus the hybrid node. Suppose the root gall contains k subtrees. We have the constraint 3 ≤ k ≤ n, as a gall has at least 3 subtrees (of the left hybridizing node, hybrid node, and right hybridizing node), and the root gall can have as many as n subtrees, each ancestral to a single leaf. Without loss of generality, we can assume that the number of right side nodes in the root gall, r, is less than or equal to the number of left side nodes $\ell$; owing to the existence of the hybrid node, $\ell + 1 + r = k$. We divide the case further based on the parity of k, writing
$$D_n = D_n^{(e)} + D_n^{(o)}, \tag{7}$$
where $D_n^{(e)}$ and $D_n^{(o)}$ count unlabeled trees with n leaves in which the root is a top node and the number of descendant subtrees of the root gall is even and odd, respectively.

Even Number of Subtrees of the Root Gall
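We consider each even value of k = 2a, for $a = 2, 3, \ldots, \lfloor \frac{n}{2} \rfloor$. Because $\ell + 1 + r = k$ and k is even, we have the strict inequality $r < \ell$, so that r ranges from 1 to $\frac{k}{2} - 1 = a - 1$.
Consider the k subtrees in an order that proceeds from the most ancestral left side node descending through subsequent left side nodes to the left hybridizing node, then to the hybrid node, then the right hybridizing node, and then through ancestors to the most ancestral right side node (Fig. 2B, C). Once k and r have been specified, we consider all possible ways of placing the n leaves into the k subtrees of the gall: the compositions $c \in C(n, k)$. For a composition $c = (c_1, c_2, \ldots, c_k)$, where $c_i$ is the value of the ith term, the number of galled trees is $\prod_{i=1}^{k} A_{c_i}$. We have
$$D_n^{(e)} = \sum_{a=2}^{\lfloor n/2 \rfloor} \sum_{r=1}^{a-1} \sum_{c \in C(n,2a)} \prod_{i=1}^{2a} A_{c_i}. \tag{8}$$
The summand $\sum_{c \in C(n,2a)} \prod_{i=1}^{2a} A_{c_i}$, representing the number of distinct lists of 2a subtrees with total number of leaves n, does not depend on r, the number of those subtrees descended from right side nodes. For n ≤ 2, a sum from a = 2 to $a = \lfloor \frac{n}{2} \rfloor$ is empty. We can therefore simplify Eq. 8 to obtain, for all n ≥ 1,
$$D_n^{(e)} = \sum_{a=2}^{\lfloor n/2 \rfloor} (a-1) \sum_{c \in C(n,2a)} \prod_{i=1}^{2a} A_{c_i}. \tag{9}$$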

Odd Number of Subtrees of the Root Gall
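If k is odd, then we consider k = 2a + 1 for $a = 1, 2, \ldots, \lfloor \frac{n-1}{2} \rfloor$. In this case, with $r \leq \ell$, we have 1 ≤ r ≤ a. With $\ell + 1 + r = k$ and k odd, $r = \ell$ is possible.
If $r < \ell$, then r ranges from 1 to $\frac{k-1}{2} - 1 = a - 1$ (Fig. 2D). We follow the reasoning of the case of even k and find that this case contributes a number of galled trees equal to
$$\sum_{a=1}^{\lfloor (n-1)/2 \rfloor} (a-1) \sum_{c \in C(n,2a+1)} \prod_{i=1}^{2a+1} A_{c_i}. \tag{10}$$
Consider $r = \ell = a$. For non-palindromic compositions of n into k parts representing the k subtrees of the gall, an equivalent unlabeled galled tree is obtained by a mirror-image composition that corresponds to an exchange of left and right subtrees of the root gall (Fig. 2E). We therefore multiply by $\frac{1}{2}$ to account for the fact that each unlabeled galled tree is counted twice, so that the non-palindromic compositions of n contribute a number of galled trees equal to
$$\frac{1}{2} \sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \sum_{c \in C_{np}(n,2a+1)} \prod_{i=1}^{2a+1} A_{c_i}. \tag{11}$$
With $r = \ell = a$, for a palindromic composition c of n into k parts representing the k subtrees of the root gall (Fig. 2F, G), we can select two distinct lists of galled subtrees for the a left and the a right subtrees; the number of ways to do so is $\binom{\prod_{i=1}^{a} A_{c_i}}{2}$. Alternatively, we can select the same lists of galled subtrees for the a left and a right subtrees, in $\prod_{i=1}^{a} A_{c_i}$ ways. Any choices for the left and right subtrees can be combined with $A_{c_{a+1}}$ choices for the subtree of the hybrid node of the gall. The palindromic compositions produce a number of galled trees equal to
$$\sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \sum_{c \in C_p(n,2a+1)} \left[ \binom{\prod_{i=1}^{a} A_{c_i}}{2} + \prod_{i=1}^{a} A_{c_i} \right] A_{c_{a+1}}. \tag{12}$$
Summing the three cases in Eqs. 10, 11, and 12, we obtain
$$D_n^{(o)} = \sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \left\{ (a-1) \sum_{c \in C(n,2a+1)} \prod_{i=1}^{2a+1} A_{c_i} + \frac{1}{2} \sum_{c \in C_{np}(n,2a+1)} \prod_{i=1}^{2a+1} A_{c_i} + \sum_{c \in C_p(n,2a+1)} \left[ \binom{\prod_{i=1}^{a} A_{c_i}}{2} + \prod_{i=1}^{a} A_{c_i} \right] A_{c_{a+1}} \right\}. \tag{13}$$
We can simplify this expression further. For a palindromic composition $c \in C_p(n, 2a+1)$, by definition of palindromic compositions, $c_i = c_{2a+2-i}$ for i = 1, 2, . . ., a, so that $\prod_{i=1}^{2a+1} A_{c_i} = \left( \prod_{i=1}^{a} A_{c_i} \right)^2 A_{c_{a+1}}$. $C(n, 2a+1)$ is the disjoint union of $C_p(n, 2a+1)$ and $C_{np}(n, 2a+1)$. We can then write
$$\frac{1}{2} \sum_{c \in C_{np}(n,2a+1)} \prod_{i=1}^{2a+1} A_{c_i} + \sum_{c \in C_p(n,2a+1)} \left[ \binom{\prod_{i=1}^{a} A_{c_i}}{2} + \prod_{i=1}^{a} A_{c_i} \right] A_{c_{a+1}} = \frac{1}{2} \sum_{c \in C(n,2a+1)} \prod_{i=1}^{2a+1} A_{c_i} + \frac{1}{2} \sum_{c \in C_p(n,2a+1)} \prod_{i=1}^{a+1} A_{c_i}.$$
Equation 13 then becomes a sum over a in which the full set of compositions $C(n, 2a+1)$ carries the coefficient $(a - 1) + \frac{1}{2} = a - \frac{1}{2}$. Note that for n ≤ 2, a sum from a = 1 to $\lfloor \frac{n-1}{2} \rfloor$ is empty. Therefore, for all n ≥ 1,
$$D_n^{(o)} = \sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \left[ \left( a - \frac{1}{2} \right) \sum_{c \in C(n,2a+1)} \prod_{i=1}^{2a+1} A_{c_i} + \frac{1}{2} \sum_{c \in C_p(n,2a+1)} \prod_{i=1}^{a+1} A_{c_i} \right]. \tag{14}$$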

Summary
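To summarize the enumeration, the desired number of unlabeled galled trees with n leaves, $A_n$, can be calculated in Eq. 3 by summing Eqs. 9 and 14 in Eq. 7, and then adding the result to $B_n$ from Eq. 4.
We simplify by writing the sum of $D_n^{(e)}$ and $D_n^{(o)}$ as a single sum over the number of subtrees k of the root gall: the coefficient a − 1 for k = 2a in Eq. 9 and the coefficient $a - \frac{1}{2}$ for k = 2a + 1 in Eq. 14 both equal $\frac{k-2}{2}$. Recalling Eqs. 5 and 6, we can now write a simplified expression for the recursion for $A_n$. First, $A_1 = 1$. If the number of leaves n of an unlabeled galled tree is an odd value n ≥ 3, then
$$A_n = \frac{1}{2} \sum_{c \in C(n,2)} \prod_{i=1}^{2} A_{c_i} + \sum_{k=3}^{n} \frac{k-2}{2} \sum_{c \in C(n,k)} \prod_{i=1}^{k} A_{c_i} + \frac{1}{2} \sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \sum_{c \in C_p(n,2a+1)} \prod_{i=1}^{a+1} A_{c_i}. \tag{15}$$
If the number of leaves n is even, then an extra term appears:
$$A_n = \frac{1}{2} \sum_{c \in C(n,2)} \prod_{i=1}^{2} A_{c_i} + \frac{1}{2} A_{n/2} + \sum_{k=3}^{n} \frac{k-2}{2} \sum_{c \in C(n,k)} \prod_{i=1}^{k} A_{c_i} + \frac{1}{2} \sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \sum_{c \in C_p(n,2a+1)} \prod_{i=1}^{a+1} A_{c_i}. \tag{16}$$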
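The recursion can be checked numerically for small n. The following Python sketch assumes the decomposition $A_n = B_n + D_n^{(e)} + D_n^{(o)}$ in the form described above: half-weighted compositions into two parts for $B_n$ (plus $\frac{1}{2} A_{n/2}$ for even n), weight a − 1 for root galls with 2a subtrees, and weight $a - \frac{1}{2}$ plus a palindromic correction for root galls with 2a + 1 subtrees. The function names are illustrative, and brute-force enumeration of compositions is practical only for small n.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def compositions(n, k):
    """Yield compositions of n into k positive parts (none if k > n)."""
    if k < 1 or k > n:
        return
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        yield tuple(bounds[i + 1] - bounds[i] for i in range(k))


def galled_tree_counts(n_max):
    """Return [A_0, ..., A_n_max] with A_0 = 0 (assumed form of the recursion)."""
    A = [0] * (n_max + 1)
    if n_max >= 1:
        A[1] = 1
    for n in range(2, n_max + 1):
        # B_n: root is not a top node; compositions of n into 2 parts.
        B = Fraction(sum(A[c[0]] * A[c[1]] for c in compositions(n, 2)), 2)
        if n % 2 == 0:
            B += Fraction(A[n // 2], 2)
        # D_n^(e): root gall with k = 2a subtrees, weight a - 1.
        D_even = 0
        for a in range(2, n // 2 + 1):
            D_even += (a - 1) * sum(
                prod(A[ci] for ci in c) for c in compositions(n, 2 * a))
        # D_n^(o): root gall with k = 2a + 1 subtrees, weight a - 1/2,
        # plus half the product over the first a + 1 parts if c is palindromic.
        D_odd = Fraction(0)
        for a in range(1, (n - 1) // 2 + 1):
            for c in compositions(n, 2 * a + 1):
                D_odd += Fraction(2 * a - 1, 2) * prod(A[ci] for ci in c)
                if c == c[::-1]:
                    D_odd += Fraction(1, 2) * prod(A[ci] for ci in c[:a + 1])
        total = B + D_even + D_odd
        assert total.denominator == 1  # the half-weights must cancel
        A[n] = int(total)
    return A


print(galled_tree_counts(6)[1:])  # [1, 1, 2, 6, 20, 72]
```

The printed values agree with the counts 1, 1, 2, 6, 20, 72 reported by Mathur and Rosenberg (2023) for n from 1 to 6.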

Example
To illustrate the recursive enumeration of rooted galled trees, we enumerate the $A_5 = 20$ rooted galled trees with 5 leaves. The base case in Eqs. 15 and 16 is $A_1 = 1$. To evaluate $A_5$, we first evaluate $A_2$, $A_3$, and $A_4$.

Trees with Two Leaves
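Only one galled tree has two leaves: the 2-leaf tree with no galls (Table 2). Equation 16 recovers this result: $B_2 = \frac{1}{2} A_1^2 + \frac{1}{2} A_1 = 1$ (Eq. 6), $D_2^{(e)} = 0$ (Eq. 9), and $D_2^{(o)} = 0$ (Eq. 14), so that $A_2 = 1$.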

Trees with Three Leaves
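For n = 3, there are two galled trees (Table 2). Using Eqs. 4, 9, and 14, we have
$$B_3 = A_1 A_2 = 1, \qquad D_3^{(e)} = 0, \qquad D_3^{(o)} = \frac{1}{2} A_1^3 + \frac{1}{2} A_1^2 = 1.$$
Summing $B_3 = 1$, representing the tree with n = 3 leaves and no galls, $D_3^{(e)} = 0$, and $D_3^{(o)} = 1$ for the unique tree containing a gall, we have $A_3 = 2$.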

Trees with Four Leaves
For n = 4, the number of galled trees is 6. We use Eqs. 4, 9, and 14 to obtain
$$B_4 = A_1 A_3 + \frac{A_2 (A_2 + 1)}{2} = 3, \qquad D_4^{(e)} = \sum_{c \in C(4,4)} \prod_{i=1}^{4} A_{c_i} = 1, \qquad D_4^{(o)} = \frac{1}{2} \sum_{c \in C(4,3)} \prod_{i=1}^{3} A_{c_i} + \frac{1}{2} \sum_{c \in C_p(4,3)} \prod_{i=1}^{2} A_{c_i} = \frac{3}{2} + \frac{1}{2} = 2.$$
$D_4^{(e)}$ counts the unique tree with k = 4 subtrees of the root gall, tree 6. $D_4^{(o)}$ counts the two trees with k = 3 subtrees of the root gall.

Galled trees with different numbers of galls appear in different colors (0, black; 1, orange; 2, purple). For each number of leaves n, we enumerate galled trees in a canonical order. We recursively proceed through trees in which the root is not a top node of a gall, incrementing the number of leaves in the right subtree. Next, for trees in which the root is a top node, we proceed in increasing order of the number of subtrees of the root gall; for fixed numbers of subtrees, we proceed in dictionary order of $(\ell, r)$ values; for fixed $(\ell, r)$, we use reverse dictionary order of the compositions of leaves into subtrees of the gall. The canonical order is used in proceeding through subtrees of a fixed size.

Trees with Five Leaves
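We are now ready for the calculation of $A_5$, which produces $A_5 = 20$ galled trees with five leaves (Table 2).
$B_5$ enumerates trees in which the root is not a top node of a gall, trees 1 to 8 for n = 5 in Table 2. The remaining 12 trees, in which the root is a top node of a gall, are counted by $D_5^{(e)}$ and $D_5^{(o)}$.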
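The arithmetic for n = 5 can be written out as a worked example. The following lines are a sketch that assumes the forms of $B_n$, $D_n^{(e)}$, and $D_n^{(o)}$ used in the smaller cases, with $A_1 = A_2 = 1$, $A_3 = 2$, and $A_4 = 6$.

```latex
\begin{align*}
B_5 &= A_1 A_4 + A_2 A_3 = 6 + 2 = 8, \\
D_5^{(e)} &= (2-1) \sum_{c \in C(5,4)} \prod_{i=1}^{4} A_{c_i}
           = 4 \, A_2 A_1^3 = 4, \\
D_5^{(o)} &= \underbrace{\tfrac{1}{2}\left(3 A_3 A_1^2 + 3 A_2^2 A_1\right)
             + \tfrac{1}{2}\left(A_1 A_3 + A_2 A_1\right)}_{a = 1,\ k = 3}
           + \underbrace{\tfrac{3}{2} A_1^5 + \tfrac{1}{2} A_1^3}_{a = 2,\ k = 5} \\
          &= \tfrac{9}{2} + \tfrac{3}{2} + \tfrac{3}{2} + \tfrac{1}{2} = 8, \\
A_5 &= B_5 + D_5^{(e)} + D_5^{(o)} = 8 + 4 + 8 = 20.
\end{align*}
```

In the a = 1 term, the six non-palindromic and palindromic compositions of 5 into 3 parts are the three orderings of (3, 1, 1) and the three orderings of (2, 2, 1); the palindromic ones are (1, 3, 1) and (2, 1, 2).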

Unlabeled Galled Trees with n Leaves and g Galls
A salient feature of a galled tree is its number of galls. Having enumerated unlabeled galled trees with n leaves, we now proceed to subdivide the calculation according to the number of galls: the number of galled trees with n leaves is a sum over g from 0 to $\lfloor \frac{n-1}{2} \rfloor$ of the number of galled trees with n leaves and g galls. We denote by $E_{n,g}$ the number of galled trees with n leaves and g galls. Because the maximum number of galls with n leaves is $\lfloor \frac{n-1}{2} \rfloor$ (Sect. 2.4), we define $E_{n,g} = 0$ for $g > \lfloor \frac{n-1}{2} \rfloor$. As the unique galled tree with n = 1 leaf has no galls, the base case is $E_{1,0} = 1$.
Again we separate two cases: (1) the root is not the top node of a gall, and (2) the root is the top node of a gall. For the former case, we denote the count by $P_{n,g}$, with $P_{1,0} = P_{2,0} = 1$. For the latter case, we denote the count by $R_{n,g}$, with $R_{1,g} = R_{2,g} = 0$ for all g. We seek to obtain $E_{n,g} = P_{n,g} + R_{n,g}$. We use reasoning that parallels the case in which we do not keep track of the number of galls (Sect. 3).
Note that when summing over all possible values of g, for n ≥ 1, we have
$$A_n = \sum_{g=0}^{\lfloor (n-1)/2 \rfloor} E_{n,g}.$$

Root is not a Top Node of a Gall
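If the root is not a top node, then a tree with n ≥ 2 leaves and g galls can be decomposed into two subtrees. We assign one of these trees m leaves, $1 \leq m \leq \lfloor \frac{n}{2} \rfloor$, and h galls, $0 \leq h \leq \min(g, \lfloor \frac{m-1}{2} \rfloor)$. If n and g are both even, then it is possible for the two subtrees to be identical. Similarly to Eq. 4, writing $H_m = \min(g, \lfloor \frac{m-1}{2} \rfloor)$, we have
$$P_{n,g} = \begin{cases} \sum_{m=1}^{(n-1)/2} \sum_{h=0}^{H_m} E_{m,h} E_{n-m,g-h}, & \text{odd } n, \\ \sum_{m=1}^{n/2-1} \sum_{h=0}^{H_m} E_{m,h} E_{n-m,g-h} + \frac{1}{2} \sum_{h=0}^{H_{n/2}} E_{n/2,h} E_{n/2,g-h}, & \text{even } n \text{ and odd } g \geq 1, \\ \sum_{m=1}^{n/2-1} \sum_{h=0}^{H_m} E_{m,h} E_{n-m,g-h} + \frac{1}{2} \sum_{h=0}^{H_{n/2}} E_{n/2,h} E_{n/2,g-h} + \frac{1}{2} E_{n/2,g/2}, & \text{even } n \text{ and even } g. \end{cases} \tag{17}$$
Note that in this equation, we can replace $\min(g, \lfloor \frac{m-1}{2} \rfloor)$ with g; for $\lfloor \frac{m-1}{2} \rfloor < h \leq g$, $E_{m,h}$ in the summand is zero, as a tree with m leaves has at most $\lfloor \frac{m-1}{2} \rfloor$ galls. We write another expression for $P_{n,g}$ by considering compositions of n into two parts representing the numbers of leaves in the left and right subtrees. We also decompose g; because entries in a composition are strictly positive, we consider compositions of g + 2 into two parts, noting that each entry of the composition exceeds the associated number of galls by 1.
For (n, g) in which n or g is odd and n ≥ 2, similarly to Eq. 5,
$$P_{n,g} = \frac{1}{2} \sum_{c \in C(n,2)} \sum_{d \in C(g+2,2)} \prod_{i=1}^{2} E_{c_i, d_i - 1}. \tag{18}$$
For (n, g) both even, as in Eq. 6,
$$P_{n,g} = \frac{1}{2} \sum_{c \in C(n,2)} \sum_{d \in C(g+2,2)} \prod_{i=1}^{2} E_{c_i, d_i - 1} + \frac{1}{2} E_{n/2, g/2}. \tag{19}$$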

Root is a Top Node of a Gall
In the case of a root gall, we distribute among the subtrees of the root gall g − 1 galls, as one of the g galls is the root gall. We again distinguish between even and odd numbers of subtrees of the root gall, writing
$$R_{n,g} = R_{n,g}^{(e)} + R_{n,g}^{(o)},$$
where $R_{n,g}^{(e)}$ gives the number of trees with n leaves and g galls in which the root is a top node and the root gall has an even number of descendant subtrees, and $R_{n,g}^{(o)}$ gives the corresponding number of trees with an odd number of descendant subtrees of the root gall. We follow our reasoning of Sects. 3.2.1 and 3.2.2.

Even Number of Subtrees of the Root Gall
Suppose the number of the subtrees of the root gall is even, k = 2a, $a = 2, 3, \ldots, \lfloor \frac{n}{2} \rfloor$. As in Sect. 3.2.1, given k, the number of right side nodes of the root gall, r, ranges from 1 to $\frac{k}{2} - 1 = a - 1$. Here, however, we consider all ways of distributing g − 1 galls across k = 2a subtrees. Just as n leaves are placed into 2a subtrees by a composition of n into 2a parts, g − 1 galls are placed into 2a subtrees by a composition of g − 1 + 2a into 2a parts. By decomposing g − 1 + 2a, we allow for the possibility of 0 galls in a subtree; in a composition d of g − 1 + 2a, the number of galls in entry $d_i$ is $d_i - 1$.
For all (n, g) with n ≥ 1 and $0 \leq g \leq \lfloor \frac{n-1}{2} \rfloor$, the resulting number of trees is similar to Eq. 9:
$$R_{n,g}^{(e)} = \sum_{a=2}^{\lfloor n/2 \rfloor} (a-1) \sum_{c \in C(n,2a)} \sum_{d \in C(g-1+2a,2a)} \prod_{i=1}^{2a} E_{c_i, d_i - 1}. \tag{20}$$

Odd Number of Subtrees of the Root Gall
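For an odd number of subtrees of the root gall k = 2a + 1, $a = 1, 2, \ldots, \lfloor \frac{n-1}{2} \rfloor$, as in Sect. 3.2.2, r ranges from 1 to a. Again we consider $(\ell, r)$ with $r < \ell$ (as in Eq. 10) and add the $r = \ell$ case (as in Eqs. 11 and 12). With k = 2a + 1 subtrees, the g − 1 galls are placed by a composition of g − 1 + k = g + 2a into 2a + 1 parts.
If $r < \ell$, then similarly to the case of even k, the number of galled trees with k subtrees descended from a root gall of a tree with n ≥ 1 leaves and $0 \leq g \leq \lfloor \frac{n-1}{2} \rfloor$ galls is
$$\sum_{a=1}^{\lfloor (n-1)/2 \rfloor} (a-1) \sum_{c \in C(n,2a+1)} \sum_{d \in C(g+2a,2a+1)} \prod_{i=1}^{2a+1} E_{c_i, d_i - 1}. \tag{21}$$
If $r = \ell = a$, then we again distinguish between non-palindromic and palindromic compositions of n leaves into the k subtrees. Non-palindromic compositions do not result in symmetric trees, irrespective of the way the galls are placed across the subtrees. Therefore, considering only the non-palindromic compositions, similarly to Eq. 11, the number of galled trees with n ≥ 1 leaves and g ≥ 0 galls is
$$\frac{1}{2} \sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \sum_{c \in C_{np}(n,2a+1)} \sum_{d \in C(g+2a,2a+1)} \prod_{i=1}^{2a+1} E_{c_i, d_i - 1}. \tag{22}$$
Finally, for the palindromic compositions of n leaves with k odd and $r = \ell = a$, we distinguish between cases with palindromic and non-palindromic compositions describing the placement of the g − 1 galls across k subtrees. For the non-palindromic compositions, similarly to Eq. 22, the number of trees is
$$\frac{1}{2} \sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \sum_{c \in C_p(n,2a+1)} \sum_{d \in C_{np}(g+2a,2a+1)} \prod_{i=1}^{2a+1} E_{c_i, d_i - 1}. \tag{23}$$
If both the composition of n leaves and the composition of g − 1 galls are palindromic, then, as in our reasoning for Eq. 12, we can choose either two distinct or two identical lists of subtrees for the a left subtrees and the a right subtrees, and the number of trees is
$$\sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \sum_{c \in C_p(n,2a+1)} \sum_{d \in C_p(g+2a,2a+1)} \left[ \binom{\prod_{i=1}^{a} E_{c_i, d_i - 1}}{2} + \prod_{i=1}^{a} E_{c_i, d_i - 1} \right] E_{c_{a+1}, d_{a+1} - 1}. \tag{24}$$
We sum Eqs. 21, 22, 23, and 24 to obtain
$$R_{n,g}^{(o)} = \sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \left[ \left( a - \frac{1}{2} \right) \sum_{c \in C(n,2a+1)} \sum_{d \in C(g+2a,2a+1)} \prod_{i=1}^{2a+1} E_{c_i, d_i - 1} + \frac{1}{2} \sum_{c \in C_p(n,2a+1)} \sum_{d \in C_p(g+2a,2a+1)} \prod_{i=1}^{a+1} E_{c_i, d_i - 1} \right]. \tag{25}$$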

Summary
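As $E_{n,g} = P_{n,g} + R_{n,g}$, we summarize by adding Eqs. 17, 20, and 25. $E_{1,0} = 1$ and $E_{1,g} = 0$ for g ≥ 1. For (n, g) with n ≥ 2 leaves and $0 \leq g \leq \lfloor \frac{n-1}{2} \rfloor$ galls, if n is odd, g is odd, or both n and g are odd, then
$$E_{n,g} = \frac{1}{2} \sum_{c \in C(n,2)} \sum_{d \in C(g+2,2)} \prod_{i=1}^{2} E_{c_i, d_i - 1} + \sum_{k=3}^{n} \frac{k-2}{2} \sum_{c \in C(n,k)} \sum_{d \in C(g-1+k,k)} \prod_{i=1}^{k} E_{c_i, d_i - 1} + \frac{1}{2} \sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \sum_{c \in C_p(n,2a+1)} \sum_{d \in C_p(g+2a,2a+1)} \prod_{i=1}^{a+1} E_{c_i, d_i - 1}. \tag{26}$$
If both n and g are even, then an extra term appears:
$$E_{n,g} = \frac{1}{2} \sum_{c \in C(n,2)} \sum_{d \in C(g+2,2)} \prod_{i=1}^{2} E_{c_i, d_i - 1} + \frac{1}{2} E_{n/2,g/2} + \sum_{k=3}^{n} \frac{k-2}{2} \sum_{c \in C(n,k)} \sum_{d \in C(g-1+k,k)} \prod_{i=1}^{k} E_{c_i, d_i - 1} + \frac{1}{2} \sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \sum_{c \in C_p(n,2a+1)} \sum_{d \in C_p(g+2a,2a+1)} \prod_{i=1}^{a+1} E_{c_i, d_i - 1}. \tag{27}$$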
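The two-index recursion can be explored numerically for small n. The sketch below is written under an assumption about its form, by analogy with the recursion for the total count: leaves are distributed by a composition c of n, galls by a composition d of the gall count shifted so that part $d_i$ encodes $d_i - 1$ galls, a root gall with k subtrees carries weight (k − 2)/2, and an additional half-weight applies when k is odd and both c and d are palindromic. The names `gall_table` and `weight` are illustrative.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def compositions(n, k):
    """Yield compositions of n into k positive parts (none if k > n)."""
    if k < 1 or k > n:
        return
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        yield tuple(bounds[i + 1] - bounds[i] for i in range(k))


def gall_table(n_max):
    """E[n][g] = assumed number of galled trees with n leaves and g galls."""
    g_max = max((n_max - 1) // 2, 0)
    E = [[0] * (g_max + 1) for _ in range(n_max + 1)]
    if n_max >= 1:
        E[1][0] = 1

    def weight(c, d):
        # Product over paired parts; part d_i encodes d_i - 1 galls.
        return prod(E[ci][di - 1] for ci, di in zip(c, d))

    for n in range(2, n_max + 1):
        for g in range((n - 1) // 2 + 1):
            # Root is not a top node: split leaves and galls into two subtrees.
            total = Fraction(sum(weight(c, d)
                                 for c in compositions(n, 2)
                                 for d in compositions(g + 2, 2)), 2)
            if n % 2 == 0 and g % 2 == 0:
                total += Fraction(E[n // 2][g // 2], 2)
            # Root is a top node: g - 1 galls go to the k subtrees of the gall.
            for k in range(3, n + 1):
                for c in compositions(n, k):
                    for d in compositions(g - 1 + k, k):
                        total += Fraction(k - 2, 2) * weight(c, d)
                        if k % 2 == 1 and c == c[::-1] and d == d[::-1]:
                            a = (k - 1) // 2
                            total += Fraction(1, 2) * weight(c[:a + 1], d[:a + 1])
            E[n][g] = int(total)
    return E


E = gall_table(6)
for n in range(1, 7):
    print(n, E[n][: (n - 1) // 2 + 1], sum(E[n]))
```

Under these assumptions, each row sums to the total number of galled trees with n leaves, and the g = 0 column reproduces the Wedderburn-Etherington numbers, which gives two independent consistency checks.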

Example: 1 Gall
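After the galled trees with no galls (Sect. 2.3), the next simplest case for enumeration of galled trees is the galled trees with only one gall. For this case, if there is no root gall, then the one gall must be in exactly one of the two subtrees descended from the root. The other subtree is a tree with no galls. If a root gall is present, then there are no other galls, and all subtrees descended from the root gall are trees with no galls.
For n = 1, $E_{1,1} = 0$. For n ≥ 2, using the odd case Eq. 26, and noting that $E_{m,0} = U_m$, the first term of Eq. 26 when g = 1 is
$$\frac{1}{2} \sum_{c \in C(n,2)} \sum_{d \in C(3,2)} \prod_{i=1}^{2} E_{c_i, d_i - 1} = \sum_{m=1}^{n-1} U_m E_{n-m,1}.$$
The second term is
$$\sum_{k=3}^{n} \frac{k-2}{2} \sum_{c \in C(n,k)} \prod_{i=1}^{k} U_{c_i}.$$
The third term is
$$\frac{1}{2} \sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \sum_{c \in C_p(n,2a+1)} \prod_{i=1}^{a+1} U_{c_i}.$$
Summing the three terms, the number of galled trees with n ≥ 2 leaves and g = 1 gall is:
$$E_{n,1} = \sum_{m=1}^{n-1} U_m E_{n-m,1} + \sum_{k=3}^{n} \frac{k-2}{2} \sum_{c \in C(n,k)} \prod_{i=1}^{k} U_{c_i} + \frac{1}{2} \sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \sum_{c \in C_p(n,2a+1)} \prod_{i=1}^{a+1} U_{c_i}. \tag{28}$$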
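As a worked check of the one-gall case for n = 4, the three terms can be evaluated by hand. The block below is a sketch that assumes the three terms take the form written in it, with $U_1 = U_2 = U_3 = 1$, $E_{1,1} = E_{2,1} = 0$, and $E_{3,1} = 1$.

```latex
\begin{align*}
\text{first term:}\quad & \sum_{m=1}^{3} U_m E_{4-m,1}
    = U_1 E_{3,1} + U_2 E_{2,1} + U_3 E_{1,1} = 1, \\
\text{second term:}\quad & \tfrac{1}{2} \sum_{c \in C(4,3)} \prod_{i=1}^{3} U_{c_i}
    + \tfrac{2}{2} \sum_{c \in C(4,4)} \prod_{i=1}^{4} U_{c_i}
    = \tfrac{3}{2} + 1 = \tfrac{5}{2}, \\
\text{third term:}\quad & \tfrac{1}{2} \sum_{c \in C_p(4,3)} U_{c_1} U_{c_2}
    = \tfrac{1}{2} U_1 U_2 = \tfrac{1}{2}, \\
E_{4,1} &= 1 + \tfrac{5}{2} + \tfrac{1}{2} = 4.
\end{align*}
```

The result is consistent with the totals for four leaves: a tree with four leaves has at most one gall, so $E_{4,1} = A_4 - U_4 = 6 - 2 = 4$.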

Generating Functions
We now derive and analyze generating functions for $A_n$, the number of galled trees with n leaves, and $E_{n,1}$, the number of galled trees with n leaves and 1 gall. We also show that the exponential growth of $A_n$ proceeds faster with n than the exponential growth of $U_n$, the number of trees without galls, but that $E_{n,1}$ and $U_n$ follow the same exponential growth.
To analyze the generating functions, we will need the values of $A_n$ for small n and $E_{n,g}$ for small n and g. Hence, we use our recursions to exhaustively calculate the number of galled trees with n leaves, $A_n$ (Eqs. 15 and 16), and the number of galled trees with n leaves and $0 \leq g \leq \lfloor \frac{n-1}{2} \rfloor$ galls, $E_{n,g}$ (Eqs. 26 and 27). Considering n from 1 to 18, the numerical values appear in Table 3.

Generating Function for A n
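Define a generating function $A(t) = \sum_{n \geq 0} A_n t^n$. We rewrite Eqs. 15 and 16 in a single equation. To do so, we note $A_1 = 1$ and define $A_0 = 0$ and $A_n = 0$ for non-integer values of n. For n ≥ 2, we then have
$$A_n = \frac{1}{2} \left[ \sum_{c \in C(n,2)} \prod_{i=1}^{2} A_{c_i} + A_{n/2} \right] + \sum_{k=3}^{n} \frac{k-2}{2} \sum_{c \in C(n,k)} \prod_{i=1}^{k} A_{c_i} + \frac{1}{2} \sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \sum_{c \in C_p(n,2a+1)} \prod_{i=1}^{a+1} A_{c_i}. \tag{29}$$
We write the terms of the generating function with three components:
$$A(t) = t + \sum_{n \geq 2} \frac{1}{2} \left[ \sum_{c \in C(n,2)} \prod_{i=1}^{2} A_{c_i} + A_{n/2} \right] t^n + \sum_{n \geq 2} \sum_{k=3}^{n} \frac{k-2}{2} \sum_{c \in C(n,k)} \prod_{i=1}^{k} A_{c_i} \, t^n + \sum_{n \geq 2} \frac{1}{2} \sum_{a=1}^{\lfloor (n-1)/2 \rfloor} \sum_{c \in C_p(n,2a+1)} \prod_{i=1}^{a+1} A_{c_i} \, t^n. \tag{30}$$
The first term has the same form as the nonlinear terms of the generating function for the Wedderburn-Etherington numbers (Eq. 2):
$$\sum_{n \geq 2} \frac{1}{2} \left[ \sum_{c \in C(n,2)} \prod_{i=1}^{2} A_{c_i} + A_{n/2} \right] t^n = \frac{1}{2} A(t)^2 + \frac{1}{2} A(t^2). \tag{31}$$
For the second term, . . .
The step in Eq. 32 makes use of $A_0 = 0$. Equation 33 is obtained if and only if the sum in the previous step converges; that is, if and only if $|A(t)| < 1$. Finally, for the third term, . . .
Equation 34 holds because $A_0 = 0$, and the last equality holds if and only if $|A(t^2)| < 1$. Collecting the three terms, with $\sum_{k \geq 3} \frac{k-2}{2} A(t)^k = \frac{A(t)^3}{2[1 - A(t)]^2}$ for the second and $\frac{1}{2} \sum_{a \geq 1} A(t) A(t^2)^a = \frac{A(t) A(t^2)}{2[1 - A(t^2)]}$ for the third, the generating function satisfies the implicit equation
$$A(t) = t + \frac{1}{2} A(t)^2 + \frac{1}{2} A(t^2) + \frac{A(t)^3}{2[1 - A(t)]^2} + \frac{A(t) A(t^2)}{2[1 - A(t^2)]}. \tag{36}$$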
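An implicit equation of this kind can be checked against the recursion by iterating it on truncated power series. The Python sketch below assumes the functional equation $A(t) = t + \frac{1}{2}A(t)^2 + \frac{1}{2}A(t^2) + \frac{A(t)^3}{2[1-A(t)]^2} + \frac{A(t)A(t^2)}{2[1-A(t^2)]}$, obtained by summing the three components as geometric-type series; the helper names `mul`, `geometric`, and `step` are illustrative. Because the coefficient of $t^n$ on the right-hand side depends only on lower-order coefficients of A, each iteration fixes at least one more coefficient.

```python
from fractions import Fraction

N = 7  # keep coefficients of t^0, ..., t^6


def mul(p, q):
    """Product of two power series truncated to N coefficients."""
    out = [Fraction(0)] * N
    for i, pi in enumerate(p):
        if pi:
            for j, qj in enumerate(q[: N - i]):
                out[i + j] += pi * qj
    return out


def geometric(p):
    """Series for 1 / (1 - p), valid when p has zero constant term."""
    out = [Fraction(0)] * N
    out[0] = Fraction(1)
    power = out[:]
    for _ in range(1, N):
        power = mul(power, p)
        out = [x + y for x, y in zip(out, power)]
    return out


def step(A):
    """Apply the right-hand side of the assumed functional equation once."""
    # A(t^2): coefficient of t^i is A_{i/2} for even i and 0 otherwise.
    A2 = [A[i // 2] if i % 2 == 0 else Fraction(0) for i in range(N)]
    inv = geometric(A)    # 1 / (1 - A(t))
    inv2 = geometric(A2)  # 1 / (1 - A(t^2))
    cube = mul(A, mul(A, A))
    terms = [
        mul(A, A),                  # A(t)^2
        A2,                         # A(t^2)
        mul(cube, mul(inv, inv)),   # A(t)^3 / (1 - A(t))^2
        mul(mul(A, A2), inv2),      # A(t) A(t^2) / (1 - A(t^2))
    ]
    out = [Fraction(1, 2) * sum(term[i] for term in terms) for i in range(N)]
    out[1] += 1  # the leading term t
    return out


A = [Fraction(0)] * N
for _ in range(N):
    A = step(A)
coeffs = [int(x) for x in A]
print(coeffs)  # [0, 1, 1, 2, 6, 20, 72]
```

Exact rational arithmetic is used so that the half-weights cancel without rounding; the recovered coefficients match the values 1, 1, 2, 6, 20, 72 for n from 1 to 6.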

Growth of A n
We now address the asymptotic growth of A n .In particular, we show that the number of galled trees grows exponentially faster in the number of leaves n than the corresponding number of trees without galls.First, note that the radius of convergence α is a positive constant less than 1.The convergence radius of generating function U(t) for the U n (Eq.2) is a value ρ ≈ 0.4027, and in particular, 0 < ρ < 1 (p.262 Landau 1977).Because A n > U n for all n ≥ 3, A(t) > U(t) for all 0 < t < ρ.Hence, we have α ≤ ρ < 1; in the Appendix, we show α > 0.
To find the asymptotic growth from the generating function for galled trees, A(t), we use the asymptotics of implicit tree-like classes theorem (Meir andMoon 1989a, 1989b;Flajolet and Sedgewick 2009, pp. 467-468).This theorem describes the asymptotic growth of the coefficients of a generating function that is described implicitly, such as in Eq. 36.We write A(t) = φ t, A(t) , and we denote A(t) = w.
To use the theorem, we must first show that the function A(t), defined by φ(t, w) = n,k s n,k t n w k , belongs to the smooth implicit-function schema.Indeed, the necessary conditions are satisfied: 1. φ is analytic in t and w around 0 from Eq. 36 and the positive convergence radius α > 0. 2. A 0 = 0. 3. A n ≥ 0 for n ≥ 1. 4. s 0,1 = 1, which is verified by noting that the t 0 w 1 term in the right-hand side of Eq. 36 is equal to A m (t 2 ) = 1. 5. s 0,0 = 0, which follows from φ(0, 0) = A(0) = 0, and s n,k ≥ 0, which is verified from the series expansion of Eq. 36.6.From Eq. 33, there exists a coefficient s n,k > 0 for n ≥ 0 and k ≥ 2: for example, s 0,2 = 1 2 − 1 + 1 2 2 = 1 2 .7. The last condition, which we show below, is that there are solutions α and w 0 for the characteristic system: According to the theorem, functions belonging to the smooth implicit-function schema converge at the solution to the characteristic system, where they possess a square-root singularity.We conclude that A(t) converges at α, with A(α) = w 0 , and that It remains to show condition (7).We can write φ(t, w) as: where Taking the derivative with respect to w, we have We do not know the value of A(α 2 ) that appears in g 2 (α).A(t) is monotonically increasing with t > 0; because α 2 is less than the radius of convergence α, A converges at α 2 and A(α 2 ) is a finite constant.As shown above, A(α 2 ) < 1.To find (α, w 0 ) numerically, we first note that Eq. 41 depends on t only through A(t 2 ).Hence, we can traverse values of y = A(t 2 ), numerically solving Eq. 38 for the single variable w in terms of y.Solutions for w must satisfy w > y, as w = A(t) > A(t 2 ) by the monotonicity of A(t).
Next, we see that Eq. 40 contains variables w, y, and t; using the pairs (w, y) obtained in the previous step, we numerically solve Eq. 37 for t in terms of w and y.In the third step, for each triple (t, w, y), we insert the value of t into the generating function 25 n=1 A n t 2n , where values A 1 , A 2 , . . ., A 25 are taken from Table 3; we retain triples with small |y − 25 n=1 A n t 2n |.Note that in this step, we could instead have retained triples with small |w − 25 n=1 A n t n |; faster convergence of 25 n=1 A n t 2n compared to 25 n=1 A n t n with a fixed number of known values of A n suggests that a more accurate result is obtained by use of y rather than w.
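To calculate the asymptotic approximation to $A_n$, we evaluate the constant $\delta$. We numerically evaluate the derivative $A'(\alpha^2)$ from the first 25 terms of the series by a difference quotient with step 0.001. Inserting $\alpha \approx 0.2073397$ for $t$ and $A(\alpha) \approx 0.3550$ for $w$, we have $\phi_{ww}(\alpha, w_0) \approx 7.1533$ and $\delta \approx 0.2762$ by Eq. 39, and $A_n \sim 0.0779\,(4.8230^n)\, n^{-3/2}$.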

Generating Function for $E_{n,1}$
We next find the generating function of $E_{n,1}$, $E(t) = \sum_{n \geq 0} e_n t^n$, writing $e_n = E_{n,1}$.
We define $e_0 = 0$, and recall that $e_1 = 0$ and that Eq. 28 applies for $n \geq 2$. We then have an expression for $e_n$ that holds for $n \geq 1$, and we can write $E(t)$ as a sum of three parts. As in the derivation of $A(t)$, we calculate the three parts separately. For the first part, we use the fact that $e_m = 0$ for $m = 0, 1, 2$. For the second term, the derivation is identical to that of Eq. 33 and gives Eq. 46. Analogously to Eq. 33, Eq. 46 relies on a summation that can be completed if and only if $|U(t)| < 1$, that is, for $|t| < \rho$ (Landau 1977, Eqs. 4 and 5).

Growth of $E_{n,1}$
We now show that the asymptotic growth of the number of galled trees with one gall follows the asymptotic exponential growth of the number of trees with no galls. We also find the asymptotic approximation of $E_{n,1}$.
First, $E(t) > U(t)$. From the form of Eq. 48, $E(t)$ converges if and only if $|U(t)| < 1$. It is shown in Eqs. 4 and 5 of Landau (1977) that $0 < U(t) < 1$ for $0 < t < \rho$, with $\lim_{t \to \rho^-} U(t) = 1$. We conclude that $E(t)$ has the same radius of convergence $\rho$ as $U(t)$. To find the asymptotic behavior of $E(t)$, we notice that as $t \to \rho^-$ (Flajolet and Sedgewick 2009, pp. 476-477), $1 - U(t) \to 0$. Hence, the first of the two terms in Eq. 49 is the leading term as $t \to \rho^-$, producing Eq. 50. At this point, we seek to use transfer theorems to transfer the asymptotic equivalence for $E(t)$ to an asymptotic equivalence for its coefficients. To do so, we note that $U(t)$ satisfies the technical criterion that it is $\Delta$-analytic at $\rho$; that is, it is analytic in a domain of particular shape around the singularity at $\rho$. The computation of $E(t)$ from $U(t)$ maintains the property that $E(t)$ is $\Delta$-analytic with a singularity at $\rho$.
We can therefore use a transfer formula [Corollary VI.1, page 392 and Theorem VI.4, page 393 in Flajolet and Sedgewick (2009)], according to which, if $f(t)$ is $\Delta$-analytic with a singularity at $b$, and $f(t) \sim c\,(1 - t/b)^{-a}$ as $t \to b$ for a constant $c$, then $[t^n] f(t) \sim \frac{c}{\Gamma(a)}\, b^{-n} n^{a-1}$. Using Eq. 50, we apply the transfer formula to $E(t)$ with $\rho$ in the role of $b$ and $\frac{3}{2}$ for $a$, noting that $E_{n,1}$ and $U_n$ have the same exponential growth. Whereas $U_n$ has subexponential term $0.3188\, n^{-3/2}$, however, $E_{n,1}$ has the larger subexponential term $0.3910\, n^{1/2}$.
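To see how a transfer of this kind behaves numerically, the sketch below uses the stand-in function $(1 - t)^{-3/2}$, not $E(t)$ itself. It takes the standard statement that a singular term $(1 - t/b)^{-a}$ contributes coefficients of order $b^{-n} n^{a-1} / \Gamma(a)$, and compares that estimate with the exact coefficients $\Gamma(n + a) / [\Gamma(a)\, n!]$. The function names are hypothetical.

```python
import math


def coeff_power_singularity(n, a):
    """Exact [t^n] (1 - t)^(-a) = Gamma(n + a) / (Gamma(a) * n!), via log-gamma."""
    return math.exp(math.lgamma(n + a) - math.lgamma(a) - math.lgamma(n + 1))


def transfer_estimate(n, a, b=1.0, c=1.0):
    """Leading-order estimate c * b^(-n) * n^(a - 1) / Gamma(a)."""
    return c * b ** (-n) * n ** (a - 1) / math.gamma(a)


a = 1.5  # the exponent used for E(t) in the text
ratios = {n: coeff_power_singularity(n, a) / transfer_estimate(n, a)
          for n in (10, 100, 1000)}
for n, r in ratios.items():
    print(n, r)  # the ratio tends to 1 as n grows
```

With $a = \frac{3}{2}$, the estimate grows like $n^{1/2}$, which is the source of the subexponential term $n^{1/2}$ for $E_{n,1}$.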

Bivariate Generating Function for $E_{n,g}$
We now find the bivariate generating function $A(t, u) = \sum_{n \geq 0} \sum_{g \geq 0} E_{n,g} t^n u^g$. First, note that $E_{0,g} = 0$ for each $g \geq 0$. For $n = 1$, $E_{1,0} = 1$ and $E_{1,g} = 0$ for $g \geq 1$.
From the recursion for $E_{n,g}$ (Eqs. 26, 27), we get an expression for $A(t, u)$ in which $E_{m,\ell} = 0$ if at least one of $(m, \ell)$ is not in $\mathbb{N}$.
We can solve to find an expression for $A(t, u)$ in a manner similar to the solution for $A(t)$. For the second and third terms, we have $\sum_{i=1}^{k} (d_i - 1) = g - 1$ and $\sum_{i=1}^{2a+1} (d_i - 1) = g - 1$; in these terms, the $g$th gall is the root gall. In summary, inserting Eqs. 53, 54, and 55 into Eq. 52 gives an implicit expression for $A(t, u)$.

The Distribution of the Number of Galled Trees with a Fixed Number of Leaves
The bivariate generating function $A(t, u)$ provides a basis for studying the distribution of the number of galls across galled trees with $n$ leaves. The approach follows a theorem concerning asymptotic distributions, stated as Theorem 2.23 of Drmota (2009) and Proposition IX.17 on p. 682 of Flajolet and Sedgewick (2009). We use the form of the theorem quoted in Theorem 2 of Bouvel et al. (2020), who considered labeled galled trees. We conclude that for a fixed number of leaves, the number of galled trees as a function of the number of galls $g$ is asymptotically normally distributed with mean and variance linear in $n$.
Following Bouvel et al. (2020), we consider a power series $C(z, x)$ in two variables that is defined implicitly as the solution of $C(z, x) = F(z, x, C(z, x))$, where $F$ satisfies certain conditions. We suppose that $X_n$ is a sequence of random variables with $\mathbb{P}[X_n = k] = \frac{[z^n x^k]\, C(z, x)}{[z^n]\, C(z, 1)}$. Then $X_n$ is asymptotically normally distributed with a mean and variance that are linear multiples of $n$ calculated from $F$.
In our scenario, $t$, $u$, and $A$ play the roles of $z$, $x$, and $C$. $A(t, u)$ is implicitly defined as a function of $t$, $u$, and $A$ itself. With $A(t, u) = \sum_{n \geq 0} \sum_{g \geq 0} E_{n,g} t^n u^g$, $X_n$ gives the random number of galls in a randomly selected galled tree with $n$ leaves. Fixing the number of leaves $n$ in $A(t, u)$, this random variable satisfies $\mathbb{P}[X_n = g] = \frac{[t^n u^g]\, A(t, u)}{[t^n]\, A(t, 1)} = \frac{E_{n,g}}{A_n}$. To conclude that the random variable $X_n$, the random number of galls in a tree with $n$ leaves, is asymptotically normally distributed, it remains only to verify the conditions of the theorem.
Next, we have $A(t, 1) = \sum_{n \geq 0} \sum_{g \geq 0} E_{n,g} t^n = \sum_{n \geq 0} A_n t^n = A(t)$.
We then have $\psi(t, 1, w) = \phi(t, w)$. We have already shown conditions 5 and 6 in our analysis of the function $\phi$. With all the conditions demonstrated, we conclude that the random number of galls in a tree with $n$ leaves is asymptotically normally distributed.

Numerical Results
The numerical results for the number of galled trees with $n$ leaves and the number of galled trees with $0 \leq g \leq \lfloor \frac{n-1}{2} \rfloor$ galls suggest a number of simple observations (Table 3). First, for $g = 0$, we recover the Wedderburn-Etherington numbers obtained from Eq. 1. For $n = 1$ to 6, we obtain the values of $A_n$ and $E_{n,g}$ computed by Mathur and Rosenberg (2023). Finally, as $g$ is bounded above by $g_{\max} = \lfloor \frac{n-1}{2} \rfloor$, pairs of consecutive values of $n$, an odd then an even integer, have the same number of values of $g$ for which the number of galled trees $E_{n,g}$ is nonzero, namely $\lfloor \frac{n+1}{2} \rfloor$.
Considering a fixed number of leaves $n \leq 18$, we comment informally on the number of galled trees across different values of $g$. The number of trees with at least one gall is larger than the number without galls. As $g$ increases for fixed $n$, the number of trees increases to a maximum, then declines. For values of $n$ for which the maximal number of galls is even ($n = 1, 2, 5, 6, 9, 10, 13, 14, 17, 18$), the largest number of trees occurs when the number of galls is $g_{\max}/2$, half of this maximum. When $g_{\max}$ is odd ($n = 3, 4, 7, 8, 11, 12, 15, 16$), the largest number of trees occurs at $g = (g_{\max} - 1)/2$ or $g = (g_{\max} + 1)/2$.
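The bookkeeping on $g_{\max}$ in this paragraph can be checked with a few lines. The sketch below only tabulates $g_{\max} = \lfloor \frac{n-1}{2} \rfloor$ and its parity for $n \leq 18$; it does not compute any $E_{n,g}$, and the variable names are hypothetical.

```python
def g_max(n):
    """Maximal number of galls for n leaves: floor((n - 1) / 2)."""
    return (n - 1) // 2


# Values of n <= 18 grouped by the parity of g_max
even_case = [n for n in range(1, 19) if g_max(n) % 2 == 0]
odd_case = [n for n in range(1, 19) if g_max(n) % 2 == 1]

# Number of admissible gall counts g = 0, 1, ..., g_max for each n
admissible_counts = {n: g_max(n) + 1 for n in range(1, 19)}

print(even_case)
print(odd_case)
print(admissible_counts[15], admissible_counts[16])
```

Each odd $n$ and the even integer that follows it share the same $g_{\max}$, so they share the same number of admissible values of $g$.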
Figure 3 plots the number of trees for fixed $n$ as a function of $g$, considering four consecutive values of $n$ that represent the four cases possible for the parity of $n$ and $g_{\max}$. The plots are somewhat symmetric; for $n = 16$ and 17, a neighboring value of $g$ produces a number of trees close to the maximum, and for $n = 15$ and 18, the peak stands out more clearly. The patterns accord with the asymptotic normal distribution demonstrated for the number of galls as $n$ increases (Sect. 5.6).
Figure 4 examines the growth of $E_{n,g}$ on a logarithmic scale for different fixed values of $g$. The number of trees with no galls has exponential growth $d_0 \rho^{-n} n^{-3/2}$, for constants $d_0 \approx 0.3188$ and $1/\rho \approx 2.4833$ (Sect. 2.3). With one gall, $E_{n,1}$ exceeds $U_n$ with growth $d_1 \rho^{-n} n^{1/2}$ for $d_1 \approx 0.3910$, but with the same exponential growth (Sect. 5.4). With specified numbers of galls $g \geq 2$, we see that growth of $E_{n,g}$ for fixed $g$ appears to also follow an exponential trend.
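The growth quoted for the case of no galls can be compared with exact counts. The sketch below uses the Wedderburn-Etherington recursion in its standard form, which is assumed here to match Eq. 1, and compares $U_n$ with $0.3188\,(2.4833^n)\, n^{-3/2}$ using the rounded constants from the text. The function name is hypothetical.

```python
def wedderburn_etherington(n_max):
    """U[n] = number of rooted binary unlabeled trees with n leaves, 1 <= n <= n_max."""
    U = [0] * (n_max + 1)
    U[1] = 1
    for n in range(2, n_max + 1):
        # unordered pairs of subtrees with different numbers of leaves
        total = sum(U[i] * U[n - i] for i in range(1, (n + 1) // 2))
        if n % 2 == 0:
            # both subtrees have n/2 leaves: unordered pairs with repetition
            half = U[n // 2]
            total += half * (half + 1) // 2
        U[n] = total
    return U


U = wedderburn_etherington(200)
ratios = {n: U[n] / (0.3188 * 2.4833 ** n * n ** -1.5) for n in (10, 50, 200)}
print(U[1:11])
for n, r in ratios.items():
    print(n, r)
```

Because the constants are rounded to four digits, the ratio is expected to approach 1 only approximately as $n$ grows.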

Discussion
Building on the Wedderburn-Etherington recursion for enumerating rooted binary unlabeled trees with $n$ leaves (Eq. 1), we have introduced a recursion to enumerate rooted binary unlabeled (normal) galled trees with $n$ leaves. The recursion follows the spirit of the Wedderburn-Etherington formula in its recursive descent from the tree root, but with additional terms for cases in which the root of the tree is also the top node of a gall (Eqs. 15 and 16). Continuing with a similar recursive strategy, we have also obtained a recursive formula for the number of galled trees with a fixed number of leaves $n$ and a fixed number of galls $g$ (Eqs. 26 and 27). We have derived generating functions for the number of galled trees (Eq. 36) and for the number of galled trees with 1 gall (Eq. 48), analyzing their asymptotic behavior.
Our numerical calculations find that for small $n$, for a fixed number of galls $g$, the increase of the number of galled trees $E_{n,g}$ with $n$ appears faster for larger values of $g$ (Table 3, Fig. 4). Because $E_{n,g} = 0$ for $n < 2g + 1$, for higher values of $g$, values of $E_{n,g}$ at small $n$ do not reflect the asymptotic trend. Nevertheless, for $g = 1$, the initial apparent rapid growth of $E_{n,g}$ visible with increasing $n$ moderates, in accord with the finding that the exponential order of the increase is the same as for the case of no galls (Sect. 5.4). A similar moderation in growth with increasing $n$ is just observable for $E_{n,2}$, which could hint at a similar exponential growth; we can conjecture that each $E_{n,g}$ with fixed $g$ has the same exponential growth. Note that Fuchs et al. (2019, Theorem 5.1; 2022, Theorem 1 and Corollary 2) showed that the exponential growth of labeled tree-child networks and normal networks with a fixed number of reticulation vertices (corresponding to a fixed number of galls in our case) is the same for any such number; only the subexponential growth differs.
On the other hand, when $g$ is not restricted, we have shown in Sect. 5.2 that the convergence radius of the generating function $A(t)$ satisfies $0 < \alpha < \rho < 1$, so that $A_n$ grows with $0.0779\,(4.8230^n)\, n^{-3/2}$. We also observed that the number of galled trees $A_n$ grows numerically faster with $n$ than does the number of trees with no galls (Table 3).
For a fixed number of leaves $n$, the number of trees $E_{n,g}$ with a fixed number of galls increases to a maximum when the number of galls is at or near half the maximum number of galls $\lfloor \frac{n-1}{2} \rfloor$, then decreases. This pattern accords with the normal distribution we expect as $n$ increases (Sect. 5.6). It is explained by the fact that many ways often exist to add a gall to a tree with a small number of galls without changing the number of lineages $n$ (Fig. 5A). As the number of galls grows, fewer places are available in the tree to add more galls (Fig. 5B), and the number of possible trees declines. Informally, for a tree with $n$ leaves, when we have a maximum of $g_{\max}$ potential galls from which to choose, the binomial coefficient $\binom{g_{\max}}{g}$, describing the number of possible subsets containing $g$ galls, is highest for $g$ near $g_{\max}/2$.
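The informal binomial argument can be made concrete for the four values of $n$ shown in Fig. 3. The sketch below only locates the maximum of $\binom{g_{\max}}{g}$; it is a heuristic check of where the peak is expected, not a computation of $E_{n,g}$, and the function name is hypothetical.

```python
from math import comb


def binomial_peak(n):
    """Return g_max and the values of g that maximize C(g_max, g) for n leaves."""
    gmax = (n - 1) // 2
    values = [comb(gmax, g) for g in range(gmax + 1)]
    top = max(values)
    return gmax, [g for g, v in enumerate(values) if v == top]


peaks = {n: binomial_peak(n) for n in (15, 16, 17, 18)}
for n, (gmax, argmax) in peaks.items():
    print(n, gmax, argmax)
```

For odd $g_{\max}$ the binomial has two equal maxima at $(g_{\max} \pm 1)/2$, and for even $g_{\max}$ a single maximum at $g_{\max}/2$, which mirrors the peak locations described for the numerical results.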
Galled trees provide a class of networks for use with biological processes such as admixture of populations, horizontal gene transfer, hybridization, and the recombination processes for which galled trees were originally introduced (Wang et al. 2001; Gusfield et al. 2004a). Other definitions of galled trees have previously been considered in enumerative problems (Semple and Steel 2006; Chang et al. 2018; Bouvel et al. 2020; Cardona and Zhang 2020); our definition, which requires galled trees to be "normal" by imposing a minimum of four nodes per gall, is designed for scenarios in which two lineages merge to form a new third lineage, but continue to have other descendants that are not descended from this merging event. Such scenarios are suited to phenomena such as admixture and hybridization, in which the merging process of two groups to form a third group has this feature: it does not cause the disappearance of the original two groups, which are free to produce additional descendants through processes that do not involve admixture and hybridization.
In related work, Cardona and Zhang (2020) enumerated rooted binary labeled normal galled trees. Their Theorem 8 finds that the number $M_n$ of such trees with $n$ leaves is given by Eq. 58, where $C$ is the set of vectors $(k_2, k_3, \ldots, k_n)$ of nonnegative integers satisfying $1 + k_2 + 2k_3 + \cdots + (n - 1)k_n = n$. This enumeration accords with our enumeration of the corresponding unlabeled normal galled trees. For $n = 1$ and 2, Eq. 58 produces 1 rooted binary labeled normal galled tree; for $n = 3$, it gives 6 labeled trees, in accord with our count of 2 unlabeled normal galled trees, each of which has 3 possible labelings. For $n = 4$, Eq. 58 gives 69 labeled trees; the 6 unlabeled normal galled trees in Table 2 have 12, 12, 3, 12, 6, and 24 labelings, respectively, summing to 69.
The enumeration of galled trees can assist in studies involving mixture processes in the same way that the Wedderburn-Etherington enumeration assists in evolutionary biology more generally, by describing the contents of a space of biologically relevant trees that must be traversed in a variety of algorithmic, combinatorial, probabilistic, and statistical problems [e.g. Harding (1971), Matsen and Evans (2012), Sievers et al. (2014), Colijn and Plazzotta (2018), Rosenberg (2021)]. The study adds to the growing area of enumerative combinatorics of phylogenetic networks [e.g. Bouvel et al. (2020), Cardona and Zhang (2020), Gunawan et al. (2020), Bienvenu et al. (2022), Fuchs et al. (2022)] and is one of relatively few studies to examine a class of unlabeled networks (Chang et al. 2018; Mathur and Rosenberg 2023). Further work can investigate the properties of $E_{n,g}$ for fixed $g \geq 2$.
Cardona and Zhang (2020) (Eq. 58). $M_n$ is in turn bounded above by the number of rooted labeled galled trees tabulated without imposing the normality requirement, a quantity of Bouvel et al. (2020) that we call $Q_n$. Section 5 of Bouvel et al. (2020) showed that the exponential generating function $Q(t) = \sum_{n \geq 0} Q_n t^n / n!$ has positive radius of convergence $r = \frac{1}{8}$. To prove that $\alpha > 0$, we note that the number of rooted labeled normal galled trees with $n$ leaves is $M_n = \sum_{i=1}^{A_n} M(T_i)$. Here, the sum proceeds over the $A_n$ rooted unlabeled normal galled trees, as each rooted labeled normal galled tree is obtained by placing a labeling on one of the rooted unlabeled normal galled trees. $M(T_i)$ is the number of labelings of rooted unlabeled normal galled tree $T_i$.
Next, consider the concept of a symmetric node of a rooted unlabeled galled tree, an internal node with two identical rooted unlabeled subtrees. The top node of a gall can be symmetric, but side nodes of a gall cannot, as one subtree of a side node contains the reticulation node and the other does not. A reticulation node also cannot be a symmetric node, as it has only one subtree.
For a rooted unlabeled normal galled tree, the number of distinct labelings is $L(T_i) = n!/2^{s_i}$, where $s_i$ is the number of symmetric nodes of $T_i$. To see why this result holds, consider a planar representation of $T_i$, and examine all $n!$ labelings of the $n$ leaves. For each such labeling, for each symmetric node, a rotation of $T_i$ around the node generates a distinct labeling for the same labeled tree, so that each rooted labeled normal galled tree is obtained from $2^{s_i}$ of the $n!$ labelings.
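The labeling formula can be checked against the counts quoted for $n = 3$ and $n = 4$. In the sketch below, the numbers of symmetric nodes for the six trees with $n = 4$ are not taken from Table 2; they are inferred here from the quoted labeling counts 12, 12, 3, 12, 6, and 24, and should be read as an assumption. The function name is hypothetical.

```python
from math import factorial


def labelings(n, s):
    """Number of labelings n! / 2^s for a tree with n leaves and s symmetric nodes."""
    return factorial(n) // 2 ** s


# Symmetric-node counts inferred from the labeling counts quoted for n = 4 (assumption)
sym_counts_n4 = [1, 1, 3, 1, 2, 0]
counts_n4 = [labelings(4, s) for s in sym_counts_n4]
total_n4 = sum(counts_n4)

# For n = 3, each of the 2 unlabeled trees is assumed to have one symmetric node
total_n3 = 2 * labelings(3, 1)

print(counts_n4, total_n4, total_n3)
```

Every count is at least $n!/2^n$, since no tree has more than $n - 1$ symmetric nodes.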
The number of symmetric nodes is bounded above by the maximal number of internal nodes that are not side nodes or hybridization nodes. This number is $n - 1$, the number of internal nodes of a rooted tree with no galls; note that each gall adds two internal nodes to the tree, but neither of the "extra" nodes can be symmetric, as they include a side node and a reticulation node.
It is convenient to use $n$ rather than $n - 1$ for the upper bound on the number of symmetric nodes. Then
$$A_n < \frac{2^n M_n}{n!} \leq \frac{2^n Q_n}{n!}. \tag{59}$$
The generating function $Q(t)$ has positive radius of convergence $r = \frac{1}{8}$, so that $Q(2t)$ converges for $|t| < \frac{1}{16}$. Multiplying Eq. 59 by $t^n$ and summing over all $n$, we have $|A(t)| < |Q(2t)|$ for $0 < |t| < \frac{1}{16}$. Hence, the smaller $A(t)$ must have positive radius of convergence $\alpha \geq \frac{1}{16}$. In particular, $\alpha > 0$.

Fig. 2 Recursive enumeration of rooted galled trees. Triangles indicate unspecified subtrees with at least one leaf. A The root is not a top node of a gall (Eq. 4). B, C The root is a top node of a gall with an even number of subtrees (Eq. 9). The two trees show the two cases with $k = 6$: $(\ell, r) = (4, 1)$ (B) and $(\ell, r) = (3, 2)$ (C). D The root is a top node of a gall with an odd number of subtrees and $r < \ell$ (Eq. 10). In this case, $k = 5$ and $(\ell, r) = (3, 1)$. E The root is a top node of a gall with an odd number of subtrees, $r = \ell$, and the composition of $n$ leaves descended from the root gall into $k = \ell + r + 1$ parts representing $k$ subtrees is non-palindromic (Eq. 11). In this case, $k = 5$ and $(\ell, r) = (2, 2)$. Different outline colors for triangles indicate different numbers of leaves in associated subtrees. F, G The root is a top node of a gall with an odd number of subtrees, $r = \ell$, and the composition of $n$ leaves into $k$ parts is palindromic (Eq. 12). In both trees, $k = 5$ and $(\ell, r) = (2, 2)$; trees with distinct (F) and identical (G) lists of galled subtrees for left and right subtrees are depicted. Different outline colors for triangles indicate different numbers of leaves in subtrees, and different patterns in the same color indicate different topologies with equally many leaves (Color figure online)

in one expression. When adding the even-$k$ terms in $D^{(e)}_n$ and the odd-$k$ terms $D^{(o)}_n$, $k$ now ranges from 3 to $n$, considering even values of $k$ with $k = 2a$ and $a = 2, 3, \ldots, \lfloor \frac{n}{2} \rfloor$, and odd values of $k$ with $k = 2a + 1$ and $a = 1, 2, \ldots, \lfloor \frac{n-1}{2} \rfloor$. For $k = 2a$, $a - 1 = \frac{k-2}{2}$, and for $k$

Fig. 3 Number of galled trees as a function of the number of galls $g$, for fixed numbers of leaves $n$. A $n = 15$. B $n = 16$. C $n = 17$. D $n = 18$. Values are computed from Eqs. 26 and 27

Fig. 4 Number of galled trees as a function of the number of leaves $n$, for fixed numbers of galls $g = 0, 1, 2, 3, 4, 5, 6$. Values are computed from Eqs. 26 and 27. The y-axis appears on a logarithmic scale

Table 1 The number of palindromic compositions of $a$ into $b$ parts, $1 \leq b \leq a$, and the corresponding number of non-palindromic compositions

Table 3 Numbers of galled trees with specified numbers of leaves and galls